Search Results: "andres"

21 October 2013

Julian Andres Klode: python-apt 0.9 released

I released python-apt 0.9. This completely removes support for the old API from the code base (it was disabled for the entirety of 0.8 in Debian, and in Ubuntu since saucy). Highlights:

Cleanup: Complete removal of old-api support code
Bug fix: Various coverty bug fixes by Michael Vogt
Bug fix: Correctly handles multi-arch dependencies in apt.debfile, so packagekit and gdebi can now install local multi-arch packages correctly
Bug fix: A segmentation fault has been fixed. When releasing the value of the policy attribute of an apt_pkg.Cache object, its destructor deleted the pkgPolicy, but that was managed by a CacheFile from APT, causing it to be deleted twice.
Bug fix: Tests do not depend on the contents of /tmp anymore
Bug fix: All examples and old tests have been updated to the current python-apt API
Feature: Paths can now be specified using bytes objects instead of str in Python 3.
Ubuntu-specific: Meta-data for Ubuntu 14.04 although with a typo ( thar instead of tahr ), but that is fixed in git

Filed under: Debian

7 September 2013

Julian Andres Klode: Review: ThinkPad X230

This week, a few hours after Lenovo announced the X240, I bought an X230. Normally, the X230 model I bought comes with a Core i5-3320M CPU, 4GB RAM, 500GB HDD. My model was a special set including a second 4GB RAM stick and a 128 GB mSATA Plextor SSD. It came without Windows; and the ThinkVantage button is black instead of blue and has no label. I put a fresh installation of unstable on it and tweaked it to save more power when on battery (enabled RC6++, and enabled autosuspend and other powertop suggestions with a script in /etc/pm/power.d); and configured hdparm.conf to put my hard disk into standby after 5 seconds (it s only used for big data anyway, so most of the time it is unused). It now consumes 5W in idle with minimum brightness, and 8-10W with brightness 13 of 15. Consumption when surfing is 10 15 watts. Booting from grub to gdm is fast, I did not measure it, but it probably took about 5 seconds. The IPS screen looks really good. Much much much better than the screen in my 2010 Edge 15 (I reviewed that one in 2010). It seems a bit more glossy than that one, but still matte. Keyboard is good as well. The touch pad is crap however. All hardware seems to work. Comparison to the X240 for others who think about buying one of them: The X240 is lighter and thinner (it s an Ultrabook) and has an optional FullHD 12.5 inch screen. It also offers a much larger touchpad and palm rest. But compared to the X230, the X240 lacks many things: No dedicated buttons for the trackpoint (you need to use the click-pad), it s limited at 8GB RAM, uses a slower low-voltage Haswell CPU, and it uses the new M.2 connector (probably supporting only the shortest cards) instead of mini-PCIe, so it s not really possible to add an additional SSD currently; as M.2 SSDs do not seem to be available yet. I also do not know whether the X240 offers ThinkLight or LEDs for hard drive activity and similar.
Filed under: General

3 August 2013

Benjamin Mako Hill: Doctor of Philosophy

On Wednesday, I successfully defended my PhD dissertation in front of a ridiculously packed house at the MIT Media Lab. I am humbled by the support shown by the MIT Sloan, Media Lab, and Harvard communities. Earlier today, I finished up paperwork and submitted my archival copies. I m done. Although I ve often heard PhDs described as emotional roller coasters, I feel enormously blessed in that I honestly can t relate. My eight years at MIT and Harvard have been almost universally positive and I have learned and grown indescribably. As excited as I am about my next chapter at the University of Washington, I m going to miss my life here. Deeply. My dissertation was three essays on volunteer mobilization in peer production. Once I have a chance to catch up and recover, I ll be posting the previously unpublished pieces. The Remixing Dilemma was included in the dissertation and is already online. The Media Lab AV team shot professional video of the talk. When I get a copy of the video, I ll post that too. But because I think it s important, I ve formatted and published the acknowledgments section of the dissertation today. Although there are too many folks to thank, I ve highlighted the contributions of my co-authors, and friends, Aaron Shaw and Andr s Monroy Hern ndez and my almost unbelievably incredible group of advisors: Eric von Hippel, Yochai Benkler, Mitch Resnick, and Tom Malone.

9 May 2013

Benjamin Mako Hill: The Remixing Dilemma: The Trade-off Between Generativity and Originality

This post was written with Andr s Monroy-Hern ndez. It is a summary of a paper just published in American Behavioral Scientist. You can also read the full paper: The remixing dilemma: The trade-off between generativity and originality. It is part of a series of papers I have written with Monroy-Hern ndez using data from Scratch. You can find the others on my academic website.

Remixing the reworking and recombination of existing creative artifacts represents a widespread, important, and controversial form of social creativity online. Proponents of remix culture often speak of remixing in terms of rich ecosystems where creative works are novel and highly generative. However, examples like this can be difficult to find. Although there is a steady stream of media being shared freely on the web, only a tiny fraction of these projects are remixed even once. On top of this, many remixes are not very different from the works they are built upon. Why is some content more attractive to remixers? Why are some projects remixed in deeper and more transformative ways?

We try to shed light on both of these questions using data from Scratch a large online remixing community. Although we find support for several popular theories, we also present evidence in support of a persistent trade-off that has broad practical and theoretical implications. In what we call the remixing dilemma, we suggest that characteristics of projects that are associated with higher rates of remixing are also associated with simpler and less transformative types of derivatives.

Our study is focused on two interrelated research questions. First, we ask why some projects shared in remixing communities are more or less generative than others. Generativity a term we borrow from Jonathan Zittrain describes creative works that are likely to inspire follow-on work. Several scholars have offered suggestions for why some creative works might be more generative than others. We focus on three central theories:

Projects that are moderately complicated are more generative. The free and open source software motto release early and release often suggests that simple projects will offer more obvious opportunities for contribution than more polished projects. That said, projects that are extremely simple (e.g., completely blank slates) may also uninspiring to would-be contributors.
Projects by prominent creators are more generative. The reasoning for this claim comes from the suggestion that remixing can act as a form of cultural conversation and that the work of popular creators can act like a common medium or language.
Projects that are remixes themselves are more generative. The reasoning for this final claim comes from the idea that remixing thrives through the accumulation of contributions from groups of people building on each other s work.

Our second question focuses on the originality of remixes and asks when more or less transformative remixing occurs. For example, highly generative projects may be less exciting if the projects produced based on them are all near-identical copies of antecedent projects. For a series of reasons including the fact that increased generativity might come by attracting less interested, skilled, or motivated individuals we suggest that each of the factors associated with generativity will also be associated with less original forms of remixing. We call this trade-off the remixing dilemma. We answer both of our research questions using a detailed dataset from Scratch, where young people build, share, and collaborate on interactive animations and video games. The community was built to support users of the Scratch programming environment, a desktop application with functionality similar to Flash created by the Lifelong Kindergarten Group at the MIT Media Lab. Scratch is designed to allow users to build projects by integrating images, music, sound, and other media with programming code. Scratch is used by more than a million users, most of them under 18 years old. To test our three theories about generativity, we measure whether or not, as well as how many times, Scratch projects were remixed in a dataset that includes every shared project. Although Scratch is designed as a remixing community, only around one tenth of all Scratch projects are ever remixed. Because more popular projects are remixed more frequently simply because of exposure, we control for the number of times each project is viewed. Our analysis shows at least some support for all three theories of generativity described above. (1) Projects with moderate amounts of code are remixed more often than either very simple or very complex projects. (2) Projects by more prominent creators are more generative. (3) Remixes are more likely to attract remixers than de novo projects. To test our theory that there is a trade-off between generativity and originality, we build a dataset that includes every Scratch remix and its antecedent. For each pair, we construct a measure of originality by comparing the remix to its antecedent and computing an edit distance (a concept we borrow from software engineering) to determine how much the projects differ. We find strong evidence of a trade-off: (1) Projects of moderate complexity are remixed more lightly than more complicated projects. (2) Projects by more prominent creators tend to be remixed in less transformative ways. (3) Cumulative remixing tends to be associated with shallower and less transformative derivatives. That said, our support for (1) is qualified in that we do not find evidence of the increased originality for the simplest projects as our theory predicted.

Two plots of estimated values for prototypical projects. Panel 1 (left) display predicted probabilities of being remixed. Panel 2 (right) display predicted edit distances. Both panels show predicted values for both remixes and de novo projects from 0 to 1,204 blocks (99th percentile).

Two plots of estimated values for prototypical projects. Panel 1 (left) displays predicted probabilities of being remixed. Panel 2 (right) displays predicted edit distances. Both panels show predicted values for both remixes and de novo projects from 0 to 1,204 blocks (99th percentile).

We feel that our results raise difficult but important challenges, especially for the designers of social media systems. For example, many social media sites track and display user prominence with leaderboards or lists of aggregate views. This technique may lead to increased generativity by emphasizing and highlighting creator prominence. That said, it may also lead to a decrease in originality of the remixes elicited. Our results regarding the relationship of complexity to generativity and originality of remixes suggest that supporting increased complexity, at least for most projects, may have fewer drawbacks. As supporters and advocates of remixing, we feel that although highly generative works that lead to highly original derivatives may be rare and difficult for system designers to support, understanding remixing dynamics and encouraging these rare projects remain a worthwhile and important goal. Benjamin Mako Hill, Massachusetts Institute of Technology
Andr s Monroy-Hern ndez, Microsoft Research For more, see our full paper, The remixing dilemma: The trade-off between generativity and originality. Published in American Behavioral Scientist. 57-5, Pp. 643 663. (Official Link, Pay-Walled ).

10 April 2013

Julian Andres Klode: apt-show-versions rewrite in C++ (more than 10 times faster)

The script apt-show-versions is developed by another Debian Developer called Christoph Martin in Perl. Recently, it turned out that apt-show-versions is too slow for some users; so I decided to rewrite his program using APT s C++ API. I expect this to be part of a future APT release, rendering the original apt-show-versions obsolete. The rewrite is sadly not 100% backwards compatible to the original version; as some option names had to be renamed due to our command-line parser not supporting option names like -nh, and some other options were dropped because they are hard to support (like status-file and lists-dir) with our command-line parsing. I also decided not to keep the the -p and -r options, but use the standard APT command-line conventions insteads. For now, it also cannot show you the distribution names you have specified in your sources.list file, but will always display codenames instead; if available. I hope to fix this in Jessie by extending APT s cache format a bit. On the performance side, this program now takes about 0.09s compared to the 1.40 seconds needed by apt-show-versions. The times are taken with all data in caches. The current version can be found in a git repository, a link to gitweb is: http://anonscm.debian.org/gitweb/?p=users/jak/apt-show-versions.git
Please also note that support for allversions is not 100% implemented yet, but it should work for most uses. Now, go testing and report back!
Filed under: Debian

11 January 2013

Julian Andres Klode: Recursive-descent in Python, using generators

Writing recursive descent parsers is easy, especially in Python (just like everything is easy in Python). But Python defines a low limit on the number of recursive calls that can be made, and recursive descent parsers are prone to exceed this limit. We should thus write our parser without real recursion. Fortunately, Python offers us a way out: Coroutines, introduced in Python 2.5 as per PEP 342. Using coroutines and a simple trampoline function, we can convert every mutually recursive set of functions into a set of coroutines that require constant stack space. Let s say our parser looks like this (tokens being an iterator over characters):

def parse(tokens):
    token = next(tokens)
    if token.isalnum():
        return token
    elif token == '(':
         result = parse(tokens)
         if next(tokens) != ')': raise ...
         return result
    else:
         raise ...

We now apply the following transformations:
return X => yield (X)
parse() => (yield parse()) That is, we yield the value we want to return, and we yield a generator whenever we want to call (using a yield expression). Our rewritten parser reads:

def parse(tokens):
    token = next(tokens)
    if token.isalnum():
        yield token
    elif token == '(':
         result = yield parse(tokens)
         if next(tokens) != ')': raise ...
         return result
    else:
         raise ...

We obviously cannot call that generator like the previous function. What we need to introduce is a trampoline. A trampoline is a function that manages that yielded generators, calls them, and passes their result upwards. It looks like this:

def trampoline(generator):
    """Execute the generator using trampolining. """
    queue = collections.deque()
    def schedule(coroutine, stack=(), value=None):
        def resume():
            if 0:
                global prev
                now = stack_len(stack)
                if now < prev:
                    print("Length", now)
                    prev = -1
                elif prev != -1:
                    prev = now
            result = coroutine.send(value)
            if isinstance(result, types.GeneratorType):     # Normal call
                schedule(result, (coroutine, stack))
            elif isinstance(result, Tail):                  # Tail call (if you want to)
                schedule(result.value, stack)
            elif stack:                                     # Return to parent
                schedule(stack[0], stack[1], result)
            else:                                           # Final Return
                return result
        queue.append(resume)
    schedule(generator)
    result = None
    while queue:
        func = queue.popleft()
        result = func()
    return result

This function is based on the code in PEP 342, the difference being that

we do not correctly propagate exceptions through the stack, but directly unwind to the caller of the parser (we don t handle exceptions inside our parser generators anyway)
the code actually compiles (code in PEP used value = coroutine.send(value) which does not work)
the code returns a value (code in PEP was missing a return in schedule)
we don t use a class, and allow only one function to be scheduled at once (yes, we could get rid of the queue)
we allow tail calls [where Tail = namedtuple("Tail", ["result"])] to save some more memory.

For a more generic version of that, you might want to re-add exception passing, but the exceptions will then have long tracebacks, so I m not sure how well they will work if you have deep recursion. Now, the advantage of that is that our parser now requires constant stack space, because the actual real stack is stored in the heap using tuples which are used like singly-linked lists in scheme here. So, the only recursion limit is available memory. A disadvantage of that transformation is that the code will run slightly slower for small inputs that could be handled using a normally recursive parser.
Filed under: Python

31 October 2012

Benjamin Mako Hill: Time to Boot

Last weekend, my friend Andr s Monroy-Hern ndez pointed out something that I've been noticing as well. Although the last decade has seen a huge decrease in the time of it takes to boot, the same can not be said for the increasing powerful computer in my pocket that is my phone. Graph showing increasing boot-times for phones and decreasing boot-times for laptops.

Graph showing increasing boot-times for phones and decreasing boot-times for laptops.

As the graph indicates, I think my cross-over was around 2010 when I acquired an SSD for my laptop.

16 August 2012

Julian Andres Klode: Cleaning up the system with pseudo-boolean optimization

You can use a PBO solver to clean up your system from unneeded automatically installed packages. First of all, you convert the system state to PB, and add an optimization function telling it to remove as many automatically installed packages as possible. Then you run this thing through a solver (such as clasp, which seems the fastest solver for PBO instances in the Debian archive) and convert its output to human-readable package names. Code is provided at http://anonscm.debian.org/gitweb/?p=users/jak/cleanup.git, under the MPL 2.0. You need to have python-apt and clasp installed to use it. There is potential minisat+ support, but it s currently a bit broken. To use, run python program_builder.py, and it will tell you which packages are no longer needed on your system. It ignores Suggests, if you want those in, you have to hack the code and replace Recommends by Recommends , Suggests . You can also turn of such dependencies by setting Program.hard_softdeps to False.
Filed under: APT2, Debian, Python

11 August 2012

Julian Andres Klode: Implicit preferences in OR dependencies

Debian packages commonly use or dependencies of the form a b to mean that a or b should be installed, while preferring option a over b. In general, for resolving an or dependency, we will try all options from the left to the right, preferring the left-most option. We also prefer real packages over virtual ones. If one of the alternatives is already installed we use that.

def solve_or(or):
  best_real = None
  best_virtual = None
  for dep in or:
     for target in dep:
        if target.name == dep.name and best_real is None:
           best_real = target
        if target.name != dep.name and best_virtual is None:
           best_virtual = target        
        if target.is_installed():
          return target
  return best_real if best_real is not None else best_virtual

Now, this way of solving dependencies is slightly problematic. Let us consider a package that depends on: a b, b. APT will likely choose to install a to satisfy the first dependency and b to satisfy the second. I currently have draft code around for a future version of APT that will cause it to later on revert unneeded changes, which means that APT will then only install b . This result closely matches the CUDF solvers and cupt s solver. On the topic of solving algorithms, we also have the problem that optimizing solvers like the ones used with apt-cudf do not respect the order of dependencies, rather choosing to minimise the number of packages installed. This causes such a solver to often do stuff like selecting an sqlite database as backend for some service rather then a larger SQL server, as that installs fewer packages. To make such solvers aware of the implicit preferences, we can introduce a new type of dependency category: Weak conflicts, also known as Recommends-Not. If a package P defines a Recommends-Not dependency against a package Q, then this means that Q should not be installed if P is installed. Now, if we have a dependency like: Depends: a b c we can encode this as: Recommends-Not: c, c, b Causing the solver to prefer a, then b, and then c. This should be representable as a pseudo-boolean optimization problem, as is common for the dependency problem, although I have not looked at that yet it should work by taking the standard representation of conflicts, adding a relaxation variable and then minimising [or maximising] the number of relaxation variables.
Filed under: APT2, Debian, General

23 June 2012

Bernd Zeimetz: Report from the Bug Squashing Party in Salzburg

photo from salzburg-cityguide.com, Copyright (C) Uwe Brandl

Participation and Results From June 15-17th we held a Debian BugSquashingParty in Salzburg, hosted and sponsored by conova communications GmbH. It was a fun and busy weekend, with 15-17 people from 5 countries being around, mainly working on RC bugs in Testing/Unstable. Gerfried Fuchs (rhonda) also worked on triaging the impact of RC bugs on the version in Squeeze, while Peter Palfrader (weasel) took care of Tor related things and Debian sysadmin work, including starting on the new bugs and udd hosts. Phillip Hug (hug) worked on the debian.ch infrastructure. Together with Miroslav Such from Red Hat Bernd Zeimetz (bzed) worked on the packaging of the necessary libraries and daemons to add (basic) Spacewalk client support to Debian. As soon as the packages passed NEW and #677871 was applied (thanks to the APT guys for working on that already), managing Debian clients with Spacewalk should work out of the box. Of course we also had a little keysigning party

Statistics

about 68 bugs in unstable/testing were triaged/patched/fixed or at least pinged

54 bugs were tagged to show if they affect Squeeze, several other bugs were pinged to retrieve necessary information or to trigger an update in the next stable pointrelease.

5 packages were introduced into Debian (still in NEW, though) - the Spacewalk client related packages and libapache2-mod-auth-memcookie.

Accomodation Thanks to Debian funds we were able to provide accomodation for four participants in the JUFA youth hostel in Salzburg. We had paid in advance for eight, but changing to rooms with a higher category for only 4 people would have been equally or more expensive.

Press/Media coverage Additionally to being mentioned in the calendars on ProLinux and similar pages, we had some press coverage by the local newspaper and online magazines:

Scan of Salzburger Woche / Stadt Nachrichten

article on the Salzburger Nachrichten website

meinbezirk.at

Salzburg Cityguide - Photogallery

Fun facts We consumed 2kg of Leberkas, a big plate of "Buchteln mit Vanillesosse", about 16000cm^2 of Pizza, about 80 litres of coke, juice, beer and wine and I guess we drank at least the same amount of water. We had coffee made of 1.5kg coffee beans and managed to empty the (formerly well filled) icemaker in the fridge. Also we had successful training sessions of a standard Debconf game (rules won't be explained here obviously). Maybe we even successfully spread the game to the employees of a commercial linux distribution

10 June 2012

Gregor Herrmann: RC bugs 2012/23

here's the list of RC bugs I've worked on during the last week:

~~#666373~~ src:scribus: "scribus: FTBFS: dh_testroot: You must run this as root (or use fakeroot)."
fix build* targets in debian/rules, upload to DELAYED/2
~~#673864~~ searchandrescue: "searchandrescue: FTBFS in sid: install: cannot stat SearchAndRescue': No such file or directory"
fix build* target dependencies in debian/rules, upload to DELAYED/2, then moved to 0-day on maintainer's request
~~#674212~~ libalkimia: "libalkimia: ftbs missing dependency on pkg-config"
add missing build dependency, upload to DELAYED/2
~~#674353~~ src:libcss-perl: "libcss-perl: FTBFS: tests failed"
recompile grammar at build time, upload to DELAYED/2
~~#674356~~ src:kalzium: "kalzium: FTBFS: make[4]: *** No rule to make target /usr/lib/libQtOpenGL.so', needed by lib/libcompoundviewer.so.4.7.0'. Stop."
manual close
~~#674369~~ src:libtuxcap: "libtuxcap: FTBFS: Could NOT find ImageMagick"
add patch to work with newer cmake ImageMagick module, upload to DELAYED/2
~~#674370~~ src:aolserver4-nsmysql: "aolserver4-nsmysql: FTBFS: ld: cannot find -lssl"
manual close
~~#674376~~ src:sysbench: "sysbench: FTBFS: ld: cannot find -lssl"
manual close
~~#674377~~ src:korundum: "korundum: FTBFS: make[3]: *** No rule to make target /usr/lib/libQtCore.so', needed by lib/korundum4.so'. Stop."
manual close
~~#676051~~ src:biber: "biber: FTBFS: tests failed"
add missing build dependencies (pkg-perl)
~~#676072~~ src:libgeo-osm-tiles-perl: "libgeo-osm-tiles-perl: FTBFS: tests failed"
add patch to skip test that needs internet access (pkg-perl)

31 May 2012

Julian Andres Klode: Memory allocators

In the past days, I looked at various memory allocators in a quest to improve performance in my multi-threaded test cases of a reference counting language runtime / object allocator (a fun project). It turns out the glibc s memory allocator is relatively slow, especially if you do loops that create one element and destroy one element at the same time (for example, map() on a list you no longer use after you passed it to map). To fix this problem, I considered various options. The first option was to add a thread-local cache around malloc(). So whenever we want to free() an object, we place it on a thread-local list instead, and if a malloc() request for an object of that size comes in, we reuse the old object. This fixes the problem with the lists, but experiments shown another problem with the glibc allocator: Allocating many objects without releasing others (let s say appending an element to a functional list). I started testing with tcmalloc instead, and noticed that it was several times faster (reducing run time from 6.75 seconds to 1.33 seconds (5 times faster)). As I do not want to depend on a C++ code base, I decided to write a simple allocator that allocates blocks of memory via mmap(), splits those into objects, and puts the objects on a local free list. That allocator performed faster than tcmalloc(), but was also just a simple test case, and not really useable, due to potential fragmentation problems (and because the code was statically linked in, causing thread-local storage to be considerably faster, as it does not need to call a function on every TLS access, but can rather go through a segment register). I then discovered jemalloc, and tried testing with this. It turned out that jemalloc was even faster than tcmalloc in all test cases, and also required about the same amount of memory as tcmalloc (and slightly more than the custom test code, as that combined reference counting with memory allocator metadata and thus had less meta data overhead). Using jemalloc, the list building test performs in 0.9X seconds now (instead of tcmalloc s 1.33 or glibc s 6.75 seconds), requires 0 major page faults (tcmalloc has 2), and uses 5361472 kb memory (tcmalloc uses 5290240, glibc requires 7866608). Given that jemalloc is written in C, I can only recommend it to everyone in need of a scalable memory allocator. Depending on your workload, it might require less memory than tcmalloc (at least that s the case at Facebook) and is most likely faster than tcmalloc. It also provides tcmalloc-compatible profiling facilities (as Facebook needed them). Furthermore, jemalloc is also the memory allocator used by FreeBSD, and is used by Mozilla projects, and on several Facebook projects, and should thus be relatively stable and useable. All tests were run with 4 threads, on a dual-core laptop with hyper-threading, using (e)glibc 2.13, tcmalloc 2.0, and jemalloc 3.0. The tests may not be representative of real world performance, and do not account for fragmentation. Should I try replacing Python s custom caching of malloc() calls with simply calling jemalloc, and run a few Python benchmarks? Anyone interested in that?
Filed under: General

22 April 2012

Julian Andres Klode: Reference Counting and Tail Calls

One thing I thought about today is reference counting in combination with tail calls. Imagine a function like this:

function f(x)   return g(x+1);

Now consider that x is a reference counted object and that x + 1 creates a new object. The call to g(x + 1) shall be in tail call position. In most reference counted languages, the reference to an argument is owned by the caller. That is, f() owns the reference to x + 1. In that case, the call to g() would no longer be in a tail call position, as we still have to decrease the reference count after g() exits. An alternative would be that the callee owns the reference. This however, will most likely create far more reference count changes than a caller-owns language (increase reference count in caller, decrease reference count in callee). For example, the following function requires an increment before the call to g().

function f1(x)   return g(x);

Does anyone have any ideas on how to solve the problem of tail calls while avoiding the callee-owns scenario? Something that does not require a (non-reference-counting) garbage collector would be preferably.
Filed under: General

2 April 2012

Julian Andres Klode: Currying in the presence of mutable parameters

In the language introduced yesterday, mutable parameters play the important role of representing the side effects of actions in the type system and thus ensure referential transparency. One of the questions currently unaddressed is how to handle currying in the presence of mutable parameters. In order to visualise this problem, consider a function

    printLn(mutable IO, String line)

If we bind the the first parameter, what should be the type of the function, and especially important how do we get back the mutable parameter? Consider the partially applied form printLn1:

    printLn1(String line)

The mutability parameter would be lost and we could not do any further I/O (and the curried form would appear as a pure function, so a good compiler would not even emit a call to it) An answer might be to put the thing into the result when currying:

    printLn2(String line) -> mutable IO

But how can we explain this? In the end result, do we maybe have to use a signature like:

    printLn(String, mutable in IO io1, mutable out IO io2)

We could then introduce syntax to call that as

    printLn(string, mutable io)

Where as the mutable io argument basically expands to io and io1 for the first call, and for later calls to io1, io2 , and so on. It can also be easily curried by allowing currying to take place on variables not declared as output parameters. We can then curry the function as:

    printLn3(mutable in IO io1, mutable out IO io2)
    printLn4(mutable out IO io2)

If so, we can rename mutable back to unique and make that easier by introducing the unary operator & for two locations, just like Mercury uses ! for it. We could then write calls looking like this:

    printLn("Hello", &io);
    printLn("Hello", io, io1);

How out parameters are explained is a different topic; we could probably say that an out parameter defines a new variable. Another option is to forbid currying of mutable parameters entirely. This would have the advantage of maintaining the somewhat simple one parameter style. The programming language Clean does not provide any special syntactic sugar for having mutable variables. In Clean, the function gets a unique object and returns a unique object (noted by *). For example, the main entry point in a Clean program (with state) looks like this:

    Start:: *World -> *World

In short, the function Start gets a abstract world passed that is unique and at the end returns a new unique world. In Clean syntax, our example function would most likely have the signature:

    printLn :: String *IO -> *IO

You know have to either maintain one additional variable for the new unique object, which gets a bit complicated with time. On the other hand, you can do function composition on this (if you have a function composition operator that preserves uniqueness when available, as should be possible in Clean):

    printTwoLines :: *IO -> *IO
    printTwoLines = (printLn "World") o (printLn "Hello")

Function composition on mutable things however, does not seem like it is needed often enough in a functional programming language with a C syntax. People might also ask why monads are not used instead. Simon L Peyton Jones and Philip Wadler described monadic I/O in their paper Imperative Functional Programming (http://research.microsoft.com/en-us/um/people/simonpj/papers/imperative.ps.Z) in 1993, and it is used in Haskell (the paper was about the implementation in Haskell anyway), one of the world s most popular and successful functional programming languages. While monadic I/O works for the Haskell crowd, and surely some other people, the use of Monads also limits the expressiveness of code, at least as far as I can tell. At least as soon as you want to combine multiple monads, you will have to start lifting stuff from one monad to another (liftIO and friends), or perform all operations in the IO monad, which prevents obvious optimizations (such as parallelizing the creation of two arrays) in short dependencies between actions are more strict than they have to be. For a functional language targeting imperative programmers, the lifting part seems a bit too complicated. One of the big advantages of monads is that they are much easier to implement, as they do not require extensions to the type system and the Hindley-Milner type inference algorithm used by the cool functional programming languages. If you want uniqueness typing however, you need to modify the algorithm or infer the basic types first and then infer uniqueness in a second pass (as Clean seems to do it).
Filed under: General

1 April 2012

Julian Andres Klode: Functional programming language for Python programmers and friends

Just for you, and this time in the Pythonesque rendering.

module main:
    import std (range)
    import std.io (printf, IO)
    # print the Fahrenheit-Celcius table for fahr = 0, 20, ..., 300
    function main(mutable IO io):
        Int lower = 0    # lower bound
        Int upper = 300  # upper bound
        Int step = 20    # step
        for Int fahr in range(lower, upper, step):
            Double celcius = 5 * (fahr - 32) / 9
            std.io.printf(io, "%3d\t%6.1f\n", fahr, celcius)

It does not really look like it, but this language is purely functional. It represents side effects using unique types. If you declare a mutable parameter, you basically declare a unique input parameter and a unique output parameter. I m also giving you a list implementation

module std.container.list:
    ## The standard singly-linked list type
    type List[E]:
        Nil                     ## empty list
        Node:
            E value             ## current value
            List[E] next        ## remaining list

And yes, both languages should be able to be represented using the same abstract syntax tree. The only change is the replacement of the opening curly brace by a colon, the removal of the closing curly bracket and semicolons, the replacement of C-style comments with Python-style comments and the requirement of indentation; oh and the for statement gets a bit lighter as well.
Filed under: General

Julian Andres Klode: [updated] Functional programming language for C programmers and friends

Just for you:

module main  
    import std (range);
    import std.io (printf, IO);
    /* print the Fahrenheit-Celcius table
        for fahr = 0, 20, ..., 300 */
    function main(mutable IO io)  
        Int lower = 0;   // lower bound
        Int upper = 300; // upper bound
        Int step = 20;   // step
        for (Int fahr in range(lower, upper, step))  
            Double celcius = 5 * (fahr - 32) / 9;
            std.io.printf(io, "%3d\t%6.1f\n", fahr, celcius);

module std.container.list  
    /** The standard singly-linked list type */
    type List[E]  
        Nil;                    /** empty list */
        Node  
            E value;            /** current value */
            List[E] next;       /** remaining list */

Thus are only excerpts from a document with tens of pages and the reference implementation of the standard library. The incomplete working draft for the language is attached: JAK Programming Language Early Working Draft (28 pages). Update: Fixed the link.
Filed under: General

3 March 2012

Julian Andres Klode: hardlink 0.2 RC1 released

I have just released version 0.2 RC1 of my hardlink program. Compared to the 0.1.X series, the program has been rewritten in C, as Python was to memory-hungry for people with millions of files. The new program uses almost the same algorithm and has almost completely the same bugs as the old version. The code should be portable to all UNIX-like platforms supporting nftw(). I have tested the code on Debian, FreeBSD 9, and Minix 3.2. For storing path names, it uses a flexible array member on C99 compilers, a zero-length-array on GNU compilers, and a array of length 1 on all other compilers (which should work everywhere as far as I heard). The new version may have slightly different behaviour compared to 0.1.2 when regular expressions are specified on the command-line, as a result of the switch from Python s regular expression engine to PCRE (if compiled with PCRE support, you can also use POSIX extended regular expressions if you like). The code is significantly faster than the old code (in situations where it s not I/O bound), and uses much less memory than the old one. Per file, we now store a <tt>struct stat</tt>, a pointer, an int, a char, and the filename; as well as three pointers for a binary search tree (which uses tsearch()). It should also be compatible to Red Hat s original hardlink tool now command-line-wise, there is at least a -c option now. The history and the name conflict are interesting, but probably nothing for this post. We re even less resistant against changing trees than Fedora s tool (and derived) currently, but should otherwise be better (and far more complicated and feature-packed). And we don t require mmap(), but use fread() instead. There was no real performance difference in testing. And we are not GPL-licensed, but use the MIT/Expat license. The package is currently entering Debian experimental for testing purposes. If you have used hardlink previously, or are just curious, give it try. And the makefile is now compatible with the various BSD makes out there, if that s interesting for you. Link to website: http://jak-linux.org/projects/hardlink/
Filed under: General

24 January 2012

Julian Andres Klode: Managing system package selections using custom meta packages

Over the last years, I have developed a variety of metapackages for managing the package selections of the systems I administrate. The meta packages are organized like this:

jak-standard: Standard packages for all systems
jak-desktop: Standard packages for all desktop systems (GNOME 3 if possible, otherwise GNOME 2)
jak-printing: Print support
jak-devel: Development packages
jak-machine-<X>: The meta package defining the computer X

Each computer has a jak-machine-X package installed. This package is marked as manually installed, all other packages are marked as automatically installed. The machine packages have the attribute XB-Important: yes set in debian/control. This creates an Important: yes field. This field is not official, but APT recognizes it and does not remove those packages (the same field is set for the APT package by APT when building the cache, as APT should not be removed either by APT). It seems to work a bit like Essential, with the exception that non-installed packages are not installed automatically on dist-upgrade. The meta packages are created using seed files similar to Ubuntu. In contrast to Ubuntu, I m not using germinate to create the packages from the seeds, but a custom dh_germinate_lite that simply takes a seed file and creates the correct substvars. It s faster than germinate and really simplistic. It also does not handle Recommends currently. The whole result can be seen on http://anonscm.debian.org/gitweb/?p=users/jak/jak-meta.git. Maybe that s useful for some people. And if you happen to find some packages in the seeds that are deprecated, please let me know. Oh, and yes, some packages (such as the letterman one) are internal software not publically available yet [letterman is a simple GUI for creating letters using LaTeX]. While I m at it, I also built Ubuntu s version of wine1.2 for i386 squeeze. It can be found in
deb http://people.debian.org/~jak/debian/ squeeze main (it still needs a few changes to be correct though, I ll upload a jak2 build soon). I also built updated sun-java6 packages for my parents (mostly needed due to the plugin, some websites do not work with the IcedTea one), but can t share the binaries due to licensing requirements. I may push out a source repository, though, so others can build those packages themselves. I ll let you know once that s done.
Filed under: Debian, General

9 December 2011

Julian Andres Klode: Combining ikiwiki and Twitter s bootstrap

Just because Julien did it, I moved my website to almost the same design now. Still using ikiwiki, of course. I m still missing a few things, such as marking the currently active page in the menu, but I hope to get that done as well soon. Go to http://jak-linux.org/ to see it.
Filed under: General

13 November 2011

Benjamin Mako Hill: Famous in Scratch

A few years ago, I ran into my friend Jay in the MIT Infinite Corridor. He was looking for volunteers to have their pictures taken and then added to the library of freely licensed and remixable media that would ship with every version of Scratch -- the graphical programming language built by Mitch Resnick's Lifelong Kindergarten group that is designed to let kids create animations and interactive games. Jay suggested I make some emotive faces and I posed for three images that made the final cut: /copyrighteous/images/scratchlib-mako-laughing.gif

/copyrighteous/images/scratchlib-mako-laughing.gif

/copyrighteous/images/scratchlib-mako-screaming.gif

/copyrighteous/images/scratchlib-mako-stop.gif

But although I've spent quite a bit of time studying the Scratch community in the last few years as it is grown to include millions of participants and projects, I forgot about about Jay's photo shoot. A couple months ago, Acetarium resident Andres Lombana Bermudez pointed out that there was a mako tag on the Scratch website and that a whole bunch of users had been publishing projects using the pictures of me which, apparently, shipped in Scratch under my name. For example, in this project in which I dance in front of a enormous "MAKO" banner: /copyrighteous/images/scratchweb-dance.png

/copyrighteous/images/scratchweb-dance.png

That said, given the rather emotive nature of the pictures, I seem to usually end up being blown up shot, shrunk, set on fire by dragons, or meeting other similarly unfortunate ends. /copyrighteous/images/scratchweb-bad_mako.png

/copyrighteous/images/scratchweb-bad_mako.png

/copyrighteous/images/scratchweb-looks_arent_everything.png

There's quite many more entertaining examples under the tag and many more elsewhere on the Scratch website although they are a little trickier to track down. Jonathan Zittrain likes to say that the best technologies are generative in the sense that they encourage their users to make things with them that the designer never forsaw or anticipated. I feel generative.

Next.

Previous.